Search | BVS Regional Portal

HAPNEST: efficient, large-scale generation and evaluation of synthetic datasets for genotypes and phenotypes.

Wharrie, Sophie; Yang, Zhiyu; Raj, Vishnu; Monti, Remo; Gupta, Rahul; Wang, Ying; Martin, Alicia; O'Connor, Luke J; Kaski, Samuel; Marttinen, Pekka; Palamara, Pier Francesco; Lippert, Christoph; Ganna, Andrea.

Bioinformatics ; 39(9), 2023 09 02.

Article in English | MEDLINE | ID: mdl-37647640

ABSTRACT

MOTIVATION: Existing methods for simulating synthetic genotype and phenotype datasets have limited scalability, constraining their usability for large-scale analyses. Moreover, a systematic approach for evaluating synthetic data quality and a benchmark synthetic dataset for developing and evaluating methods for polygenic risk scores are lacking. RESULTS: We present HAPNEST, a novel approach for efficiently generating diverse individual-level genotypic and phenotypic data. In comparison to alternative methods, HAPNEST shows faster computational speed and a lower degree of relatedness with reference panels, while generating datasets that preserve key statistical properties of real data. These desirable synthetic data properties enabled us to generate 6.8 million common variants and nine phenotypes with varying degrees of heritability and polygenicity across 1 million individuals. We demonstrate how HAPNEST can facilitate biobank-scale analyses through the comparison of seven methods to generate polygenic risk scoring across multiple ancestry groups and different genetic architectures. AVAILABILITY AND IMPLEMENTATION: A synthetic dataset of 1 008 000 individuals and nine traits for 6.8 million common variants is available at https://www.ebi.ac.uk/biostudies/studies/S-BSST936. The HAPNEST software for generating synthetic datasets is available as Docker/Singularity containers and open source Julia and C code at https://github.com/intervene-EU-H2020/synthetic_data.

Subjects

Benchmarking, Data Accuracy, Humans, Genotype, Phenotype, Multifactorial Inheritance

Toward Identification of Functional Sequences and Variants in Noncoding DNA.

Monti, Remo; Ohler, Uwe.

Annu Rev Biomed Data Sci ; 6: 191-210, 2023 08 10.

Article in English | MEDLINE | ID: mdl-37262323

ABSTRACT

Understanding the noncoding part of the genome, which encodes gene regulation, is necessary to identify genetic mechanisms of disease and translate findings from genome-wide association studies into actionable results for treatments and personalized care. Here we provide an overview of the computational analysis of noncoding regions, starting from gene-regulatory mechanisms and their representation in data. Deep learning methods, when applied to these data, highlight important regulatory sequence elements and predict the functional effects of genetic variants. These and other algorithms are used to predict damaging sequence variants. Finally, we introduce rare-variant association tests that incorporate functional annotations and predictions in order to increase interpretability and statistical power.

Subjects

DNA, Genome-Wide Association Study, Genome, Algorithms, Gene Expression Regulation

Identifying interpretable gene-biomarker associations with functionally informed kernel-based tests in 190,000 exomes.

Monti, Remo; Rautenstrauch, Pia; Ghanbari, Mahsa; James, Alva Rani; Kirchler, Matthias; Ohler, Uwe; Konigorski, Stefan; Lippert, Christoph.

Nat Commun ; 13(1): 5332, 2022 09 10.

Article in English | MEDLINE | ID: mdl-36088354

ABSTRACT

Here we present an exome-wide rare genetic variant association study for 30 blood biomarkers in 191,971 individuals in the UK Biobank. We compare gene-based association tests for separate functional variant categories to increase interpretability and identify 193 significant gene-biomarker associations. Genes associated with biomarkers were ~4.5-fold enriched for conferring Mendelian disorders. In addition to performing weighted gene-based variant collapsing tests, we design and apply variant-category-specific kernel-based tests that integrate quantitative functional variant effect predictions for missense variants, splicing and the binding of RNA-binding proteins. For these tests, we present a computationally efficient combination of the likelihood-ratio and score tests that found 36% more associations than the score test alone while also controlling the type-1 error. Kernel-based tests identified 13% more associations than their gene-based collapsing counterparts and had advantages in the presence of gain-of-function missense variants. We introduce local collapsing by amino acid position for missense variants and use it to interpret associations and identify potential novel gain-of-function variants in PIEZO1. Our results show the benefits of investigating different functional mechanisms when performing rare-variant association tests, and demonstrate pervasive rare-variant contribution to biomarker variability.

Subjects

Exome, Missense Mutation, Exome/genetics, Genetic Association Studies, Genetic Markers, Humans, Ion Channels/genetics, Exome Sequencing

The ectomycorrhizal fungus Pisolithus microcarpus encodes a microRNA involved in cross-kingdom gene silencing during symbiosis.

Wong-Bajracharya, Johanna; Singan, Vasanth R; Monti, Remo; Plett, Krista L; Ng, Vivian; Grigoriev, Igor V; Martin, Francis M; Anderson, Ian C; Plett, Jonathan M.

Proc Natl Acad Sci U S A ; 119(3), 2022 01 18.

Article in English | MEDLINE | ID: mdl-35012977

ABSTRACT

Small RNAs (sRNAs) are known to regulate pathogenic plant-microbe interactions. Emerging evidence from the study of these model systems suggests that microRNAs (miRNAs) can be translocated between microbes and plants to facilitate symbiosis. The roles of sRNAs in mutualistic mycorrhizal fungal interactions, however, are largely unknown. In this study, we characterized miRNAs encoded by the ectomycorrhizal fungus Pisolithus microcarpus and investigated their expression during mutualistic interaction with Eucalyptus grandis. Using sRNA sequencing data and in situ miRNA detection, a novel fungal miRNA, Pmic_miR-8, was found to be transported into E. grandis roots after interaction with P. microcarpus. Further characterization experiments demonstrate that inhibition of Pmic_miR-8 negatively impacts the maintenance of mycorrhizal roots in E. grandis, while supplementation of Pmic_miR-8 led to deeper integration of the fungus into plant tissues. Target prediction and experimental testing suggest that Pmic_miR-8 may target the host NB-ARC domain-containing transcripts, suggesting a potential role for this miRNA in subverting host signaling to stabilize the symbiotic interaction. Altogether, we provide evidence of previously undescribed cross-kingdom sRNA transfer from ectomycorrhizal fungi to plant roots, shedding light onto the involvement of miRNAs during the developmental process of mutualistic symbioses.

Subjects

Basidiomycota/genetics, Gene Silencing, MicroRNAs/metabolism, Mycorrhizae/genetics, Symbiosis/genetics, Base Sequence, Basidiomycota/growth & development, Microbial Colony Count, Gene Expression Profiling, Fungal Gene Expression Regulation, Fungal Genome, MicroRNAs/genetics, Plant Roots/microbiology, Messenger RNA/genetics, Messenger RNA/metabolism

Deep learning for genomics using Janggu.

Kopp, Wolfgang; Monti, Remo; Tamburrini, Annalaura; Ohler, Uwe; Akalin, Altuna.

Nat Commun ; 11(1): 3488, 2020 07 13.

Article in English | MEDLINE | ID: mdl-32661261

ABSTRACT

In recent years, numerous applications have demonstrated the potential of deep learning for an improved understanding of biological processes. However, most deep learning tools developed so far are designed to address a specific question on a fixed dataset and/or by a fixed model architecture. Here we present Janggu, a Python library that facilitates deep learning for genomics applications, aiming to ease data acquisition and model evaluation. Among its key features are special dataset objects, which form a unified and flexible data acquisition and pre-processing framework for genomics data that enables streamlining of future research applications through reusable components. Through a numpy-like interface, these dataset objects are directly compatible with popular deep learning libraries, including keras or pytorch. Janggu offers the possibility to visualize predictions as genomic tracks or by exporting them to the bigWig format as well as utilities for keras-based models. We illustrate the functionality of Janggu on several deep learning genomics applications. First, we evaluate different model topologies for the task of predicting binding sites for the transcription factor JunD. Second, we demonstrate the framework on published models for predicting chromatin effects. Third, we show that promoter usage measured by CAGE can be predicted using DNase hypersensitivity, histone modifications and DNA sequence features. We improve the performance of these models due to a novel feature in Janggu that allows us to include high-order sequence features. We believe that Janggu will help to significantly reduce repetitive programming overhead for deep learning applications in genomics, and will enable computational biologists to rapidly assess biological hypotheses.

Subjects

Deep Learning, Genomics/methods, Animals, Computational Biology, Electronic Data Processing, Humans

The regulatory and transcriptional landscape associated with carbon utilization in a filamentous fungus.

Wu, Vincent W; Thieme, Nils; Huberman, Lori B; Dietschmann, Axel; Kowbel, David J; Lee, Juna; Calhoun, Sara; Singan, Vasanth R; Lipzen, Anna; Xiong, Yi; Monti, Remo; Blow, Matthew J; O'Malley, Ronan C; Grigoriev, Igor V; Benz, J Philipp; Glass, N Louise.

Proc Natl Acad Sci U S A ; 117(11): 6003-6013, 2020 03 17.

Article in English | MEDLINE | ID: mdl-32111691

ABSTRACT

Filamentous fungi, such as Neurospora crassa, are very efficient in deconstructing plant biomass by the secretion of an arsenal of plant cell wall-degrading enzymes, by remodeling metabolism to accommodate production of secreted enzymes, and by enabling transport and intracellular utilization of plant biomass components. Although a number of enzymes and transcriptional regulators involved in plant biomass utilization have been identified, how filamentous fungi sense and integrate nutritional information encoded in the plant cell wall into a regulatory hierarchy for optimal utilization of complex carbon sources is not understood. Here, we performed transcriptional profiling of N. crassa on 40 different carbon sources, including plant biomass, to provide data on how fungi sense simple to complex carbohydrates. From these data, we identified regulatory factors in N. crassa and characterized one (PDR-2) associated with pectin utilization and one with pectin/hemicellulose utilization (ARA-1). Using in vitro DNA affinity purification sequencing (DAP-seq), we identified direct targets of transcription factors involved in regulating genes encoding plant cell wall-degrading enzymes. In particular, our data clarified the role of the transcription factor VIB-1 in the regulation of genes encoding plant cell wall-degrading enzymes and nutrient scavenging and revealed a major role of the carbon catabolite repressor CRE-1 in regulating the expression of major facilitator transporter genes. These data contribute to a more complete understanding of cross talk between transcription factors and their target genes, which are involved in regulating nutrient sensing and plant biomass utilization on a global level.

Subjects

Cell Wall/metabolism, Fungal Proteins/metabolism, Neurospora crassa/genetics, Pectins/metabolism, Polysaccharides/metabolism, Transcription Factors/metabolism, Biofuels, Biomass, Catabolite Repression, Cell Wall/chemistry, Fungal Gene Expression Regulation, Metabolic Engineering/methods, Metabolic Networks and Pathways/genetics, Neurospora crassa/metabolism, RNA-Seq

A switch in transcription and cell fate governs the onset of an epigenetically-deregulated tumor in Drosophila.

Torres, Joana; Monti, Remo; Moore, Ariane L; Seimiya, Makiko; Jiang, Yanrui; Beerenwinkel, Niko; Beisel, Christian; Beira, Jorge V; Paro, Renato.

Elife ; 7, 2018 03 21.

Article in English | MEDLINE | ID: mdl-29560857

ABSTRACT

Tumor initiation is often linked to a loss of cellular identity. Transcriptional programs determining cellular identity are preserved by epigenetically-acting chromatin factors. Although such regulators are among the most frequently mutated genes in cancer, it is not well understood how an abnormal epigenetic condition contributes to tumor onset. In this work, we investigated the gene signature of tumors caused by disruption of the Drosophila epigenetic regulator, polyhomeotic (ph). In larval tissue ph mutant cells show a shift towards an embryonic-like signature. Using loss- and gain-of-function experiments we uncovered the embryonic transcription factor knirps (kni) as a new oncogene. The oncogenic potential of kni lies in its ability to activate JAK/STAT signaling and block differentiation. Conversely, tumor growth in ph mutant cells can be substantially reduced by overexpressing a differentiation factor. This demonstrates that epigenetically derailed tumor conditions can be reversed when targeting key players in the transcriptional network.

Subjects

Cell Differentiation/genetics, Neoplastic Cell Transformation/genetics, Drosophila melanogaster/genetics, Genetic Epigenesis, Gene Expression Profiling, Animals, Genetically Modified Animals, Neoplastic Cell Transformation/pathology, DNA-Binding Proteins/genetics, Drosophila Proteins/genetics, Drosophila melanogaster/cytology, Drosophila melanogaster/embryology, Developmental Gene Expression Regulation, Gene Regulatory Networks, Larva/cytology, Larva/genetics, Mutation, Polycomb Repressive Complex 1/genetics, Repressor Proteins/genetics, Signal Transduction/genetics

Limb-Enhancer Genie: An accessible resource of accurate enhancer predictions in the developing limb.

Monti, Remo; Barozzi, Iros; Osterwalder, Marco; Lee, Elizabeth; Kato, Momoe; Garvin, Tyler H; Plajzer-Frick, Ingrid; Pickle, Catherine S; Akiyama, Jennifer A; Afzal, Veena; Beerenwinkel, Niko; Dickel, Diane E; Visel, Axel; Pennacchio, Len A.

PLoS Comput Biol ; 13(8): e1005720, 2017 Aug.

Article in English | MEDLINE | ID: mdl-28827824

ABSTRACT

Epigenomic mapping of enhancer-associated chromatin modifications facilitates the genome-wide discovery of tissue-specific enhancers in vivo. However, reliance on single chromatin marks leads to high rates of false-positive predictions. More sophisticated, integrative methods have been described, but commonly suffer from limited accessibility to the resulting predictions and reduced biological interpretability. Here we present the Limb-Enhancer Genie (LEG), a collection of highly accurate, genome-wide predictions of enhancers in the developing limb, available through a user-friendly online interface. We predict limb enhancers using a combination of >50 published limb-specific datasets and clusters of evolutionarily conserved transcription factor binding sites, taking advantage of the patterns observed at previously in vivo validated elements. By combining different statistical models, our approach outperforms current state-of-the-art methods and provides interpretable measures of feature importance. Our results indicate that including a previously unappreciated score that quantifies tissue-specific nuclease accessibility significantly improves prediction performance. We demonstrate the utility of our approach through in vivo validation of newly predicted elements. Moreover, we describe general features that can guide the type of datasets to include when predicting tissue-specific enhancers genome-wide, while providing an accessible resource to the general biological community and facilitating the functional interpretation of genetic studies of limb malformations.

Subjects

Genetic Enhancer Elements/genetics, Extremities/growth & development, Genomics/methods, Growth and Development/genetics, Software, Animals, Genome/genetics, Machine Learning, Mice
